Speech coding and synthesis using parametric curves
Authors
Abstract
Accurate modeling of co-articulation, the context-sensitive merging of the boundaries between allophones in continuous speech, is vital for natural-sounding speech synthesis. This paper describes initial research investigating the use of Bézier curves to form models of co-articulation in human speech. A 12th-order, pitch-synchronous line spectral pair (LSP) [1] analysis is performed on a corpus of 239 phonetically balanced sentences of English speech. The resulting data are divided to form an inventory of the diphones occurring in the speech database. The trajectory of each line spectral pair parameter through each diphone can then be represented by a single cubic Bézier curve segment, found using the Levenberg-Marquardt curve fitting method [2, 3]. Results are presented showing the accuracy of Bézier models of the co-articulation between different types of speech sounds.
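The fitting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single parameter track sampled at uniformly spaced frames (pitch-synchronous frames are generally not uniform in time), treats the curve as a 1-D cubic Bézier over normalised time, and uses SciPy's `least_squares` with `method="lm"` as the Levenberg-Marquardt solver. The names `bezier_cubic` and `fit_bezier_lm` and the synthetic data are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares


def bezier_cubic(u, c):
    """Evaluate a 1-D cubic Bezier curve with control values c[0..3] at u in [0, 1]."""
    u = np.asarray(u, dtype=float)
    return ((1 - u) ** 3 * c[0]
            + 3 * (1 - u) ** 2 * u * c[1]
            + 3 * (1 - u) * u ** 2 * c[2]
            + u ** 3 * c[3])


def fit_bezier_lm(track):
    """Fit one cubic Bezier segment to a parameter track with Levenberg-Marquardt."""
    track = np.asarray(track, dtype=float)
    # Assumption: frames are mapped to a uniform parameter grid on [0, 1].
    u = np.linspace(0.0, 1.0, len(track))
    # Initial guess: the track sampled at u = 0, 1/3, 2/3, 1.
    c_init = np.interp([0.0, 1 / 3, 2 / 3, 1.0], u, track)
    result = least_squares(lambda c: bezier_cubic(u, c) - track,
                           c_init, method="lm")
    rms = float(np.sqrt(np.mean(result.fun ** 2)))
    return result.x, rms


# Hypothetical stand-in for one LSP parameter's trajectory through a diphone.
true_c = np.array([0.30, 0.42, 0.25, 0.36])
frames = bezier_cubic(np.linspace(0.0, 1.0, 20), true_c)

fitted_c, rms_error = fit_bezier_lm(frames)
```

With the parameter values `u` held fixed, the problem is linear in the four control values, so ordinary least squares would give the same answer. The iterative solver is used here only to mirror the method named in the abstract. A 12th-order analysis would repeat this fit once per LSP parameter for each diphone.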
Similar resources
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis grants computers the human ability to produce speech. The data-based and process-based approaches are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...
Wideband Parametric Speech Synthesis Using Warped Linear Prediction
This paper studies the use of warped linear prediction (WLP) for wideband parametric speech synthesis. As the sampling frequency is increased from the usual 16 kHz, linear frequency resolution of conventional linear prediction (LP) cannot efficiently model the speech spectrum. By using frequency warping that weights perceptually the most important formant information, spectral models with bette...
A Simple Continuous Excitation Model for Parametric Vocoding
We describe a continuous-pitch parametric vocoder suitable for speech coding and statistical text to speech synthesis. The spectral model is based on linear prediction. We show that glottal modelling techniques from recent literature can be cherry-picked to produce an excitation signal with properties known to be useful in the above application areas. We further show that the continuous pitch p...
Statistical parametric speech synthesis with a novel codebook-based excitation model
Speech synthesis is an important modality in Cognitive Infocommunications, which is the intersection of informatics and cognitive sciences. Statistical parametric methods have gained importance in speech synthesis recently. The speech signal is decomposed to parameters and later restored from them. The decomposition is implemented by speech coders. We apply a novel codebook-based speech coding ...
Low-Dimensional Representation of Spectral Envelope Without Deterioration for Full-Band Speech Analysis/Synthesis System
A speech coding for a full-band speech analysis/synthesis system is described. In this work, full-band speech is defined as speech with a sampling frequency above 40 kHz, whose Nyquist frequency covers the audible frequency range. In prior works, speech coding has generally focused on the narrowband speech with a sampling frequency below 16 kHz. On the other hand, statistical parametric speech ...